{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Getting started with TensorFlow (Eager Mode)\n",
    "\n",
    "**Learning Objectives**\n",
    "  - Understand difference between Tensorflow's two modes: Eager Execution and Graph Execution\n",
    "  - Practice defining and performing basic operations on constant Tensors\n",
    "  - Use Tensorflow's automatic differentiation capability\n",
    "\n",
    "## Introduction\n",
    "\n",
    "**Eager Execution**\n",
    "\n",
    "Eager mode evaluates operations immediatley and return concrete values immediately. To enable eager mode simply place `tf.enable_eager_execution()` at the top of your code. We recommend using eager execution when prototyping as it is intuitive, easier to debug, and requires less boilerplate code.\n",
    "\n",
    "**Graph Execution**\n",
    "\n",
    "Graph mode is TensorFlow's default execution mode (although it will change to eager with TF 2.0). In graph mode operations only produce a symbolic graph which doesn't get executed until run within the context of a tf.Session(). This style of coding is less inutitive and has more boilerplate, however it can lead to performance optimizations and is particularly suited for distributing training across multiple devices. We recommend using delayed execution for performance sensitive production code. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import tensorflow as tf\n",
    "print(tf.__version__)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Eager Execution"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "tf.enable_eager_execution() "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Adding Two Tensors \n",
    "\n",
    "The value of the tensor, as well as its shape and data type are printed"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "a = tf.constant(value = [5, 3, 8], dtype = tf.int32)\n",
    "b = tf.constant(value = [3, -1, 2], dtype = tf.int32)\n",
    "c = tf.add(x = a, y = b)\n",
    "print(c)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Overloaded Operators\n",
    "We can also perform a `tf.add()` using the `+` operator. The `/,-,*` and `**` operators are similarly overloaded with the appropriate tensorflow operation."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "c = a + b # this is equivalent to tf.add(a,b)\n",
    "print(c)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### NumPy Interoperability\n",
    "\n",
    "In addition to native TF tensors, tensorflow operations can take native python types and NumPy arrays as operands. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np \n",
    "\n",
    "a_py = [1,2] # native python list\n",
    "b_py = [3,4] # native python list\n",
    "\n",
    "a_np = np.array(object = [1,2]) # numpy array\n",
    "b_np = np.array(object = [3,4]) # numpy array\n",
    "\n",
    "a_tf = tf.constant(value = [1,2], dtype = tf.int32) # native TF tensor\n",
    "b_tf = tf.constant(value = [3,4], dtype = tf.int32) # native TF tensor\n",
    "\n",
    "for result in [tf.add(x = a_py, y = b_py), tf.add(x = a_np, y = b_np), tf.add(x = a_tf, y = b_tf)]:\n",
    "    print(\"Type: {}, Value: {}\".format(type(result), result))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can convert a native TF tensor to a NumPy array using .numpy()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "a_tf.numpy()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Linear Regression\n",
    "\n",
    "Now let's use low level tensorflow operations to implement linear regression.\n",
    "\n",
    "Later in the course you'll see abstracted ways to do this using high level TensorFlow."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Toy Dataset\n",
    "\n",
    "We'll model the following function:\n",
    "\n",
    "\\begin{equation}\n",
    "y= 2x + 10\n",
    "\\end{equation}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X = tf.constant(value = [1,2,3,4,5,6,7,8,9,10], dtype = tf.float32)\n",
    "Y = 2 * X + 10\n",
    "print(\"X:{}\".format(X))\n",
    "print(\"Y:{}\".format(Y))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Loss Function\n",
    "\n",
    "Using mean squared error, our loss function is:\n",
    "\\begin{equation}\n",
    "MSE = \\frac{1}{m}\\sum_{i=1}^{m}(\\hat{Y}_i-Y_i)^2\n",
    "\\end{equation}\n",
    "\n",
    "$\\hat{Y}$ represents the vector containing our model's predictions:\n",
    "\\begin{equation}\n",
    "\\hat{Y} = w_0X + w_1\n",
    "\\end{equation}"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def loss_mse(X, Y, w0, w1):\n",
    "    Y_hat = w0 * X + w1\n",
    "    return tf.reduce_mean(input_tensor = (Y_hat - Y)**2)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Gradient Function\n",
    "\n",
    "To use gradient descent we need to take the partial derivative of the loss function with respect to each of the weights. We could manually compute the derivatives, but with Tensorflow's automatic differentiation capabilities we don't have to!\n",
    "\n",
    "During gradient descent we think of the loss as a function of the parameters $w_0$ and $w_1$. Thus, we want to compute the partial derivative with respect to these variables. The `params=[2,3]` argument tells TensorFlow to only compute derivatives with respect to the 2nd and 3rd arguments to the loss function (counting from 0, so really the 3rd and 4th)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Counting from 0, the 2nd and 3rd parameter to the loss function are our weights\n",
    "grad_f = tf.contrib.eager.gradients_function(f = loss_mse, params=[2,3])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### Training Loop\n",
    "\n",
    "Here we have a very simple training loop that converges. Note we are ignoring best practices like batching, creating a separate test set, and random weight initialization for the sake of simplicity."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "STEPS = 1000\n",
    "LEARNING_RATE = .02\n",
    "\n",
    "# Initialize weights\n",
    "w0 = tf.constant(value = 0.0, dtype = tf.float32)\n",
    "w1 = tf.constant(value = 0.0, dtype = tf.float32)\n",
    "\n",
    "for step in range(STEPS):\n",
    "    #1. Calculate gradients\n",
    "    d_w0, d_w1 = grad_f(X, Y, w0, w1)\n",
    "\n",
    "    #2. Update weights\n",
    "    w0 = w0 - d_w0 * LEARNING_RATE\n",
    "    w1 = w1 - d_w1 * LEARNING_RATE\n",
    "\n",
    "    #3. Periodically print MSE\n",
    "    if step % 100 == 0:\n",
    "        print(\"STEP: {} MSE: {}\".format(step, loss_mse(X, Y, w0, w1)))\n",
    "\n",
    "# Print final MSE and weights\n",
    "print(\"STEP: {} MSE: {}\".format(STEPS,loss_mse(X, Y, w0, w1)))\n",
    "print(\"w0:{}\".format(round(float(w0), 4)))\n",
    "print(\"w1:{}\".format(round(float(w1), 4)))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Bonus"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Try modelling a non-linear function such as: $y=xe^{-x^2}$"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "X = tf.constant(value = np.linspace(0,2,1000), dtype = tf.float32)\n",
    "Y = X*np.exp(-X**2) * X\n",
    "\n",
    "from matplotlib import pyplot as plt\n",
    "%matplotlib inline\n",
    "plt.plot(X, Y)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def make_features(X):\n",
    "    features = [X]\n",
    "    features.append(tf.ones_like(X))  # Bias.\n",
    "    features.append(tf.square(X))\n",
    "    features.append(tf.sqrt(X))\n",
    "    features.append(tf.exp(X))\n",
    "    return tf.stack(features, axis=1)\n",
    "\n",
    "def make_weights(n_weights):\n",
    "    W = [tf.constant(value = 0.0, dtype = tf.float32) for _ in range(n_weights)]\n",
    "    return tf.expand_dims(tf.stack(W),-1)\n",
    "\n",
    "def predict(X, W):\n",
    "    Y_hat = tf.matmul(X, W)\n",
    "    return tf.squeeze(Y_hat, axis=-1)\n",
    "\n",
    "def loss_mse(X, Y, W):\n",
    "    Y_hat = predict(X, W)\n",
    "    return tf.reduce_mean(input_tensor = (Y_hat - Y)**2)\n",
    "\n",
    "X = tf.constant(value = np.linspace(0,2,1000), dtype = tf.float32)\n",
    "Y = np.exp(-X**2) * X\n",
    "\n",
    "grad_f = tf.contrib.eager.gradients_function(f = loss_mse, params=[2])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "STEPS = 2000\n",
    "LEARNING_RATE = .02\n",
    "\n",
    "# Weights/features.\n",
    "Xf = make_features(X)\n",
    "# Xf = Xf[:,0:2]  # Linear features only.\n",
    "W = make_weights(Xf.get_shape()[1].value)\n",
    "\n",
    "# For plotting\n",
    "steps = []\n",
    "losses = []\n",
    "\n",
    "plt.figure()\n",
    "for step in range(STEPS):\n",
    "    #1. Calculate gradients\n",
    "    dW = grad_f(Xf, Y, W)[0]\n",
    "    #2. Update weights\n",
    "    W -= dW * LEARNING_RATE\n",
    "    #3. Periodically print MSE\n",
    "    if step % 100 == 0:\n",
    "        loss = loss_mse(Xf, Y, W)\n",
    "        steps.append(step)\n",
    "        losses.append(loss)\n",
    "        plt.clf()\n",
    "        plt.plot(steps, losses)\n",
    "# Print final MSE and weights\n",
    "print(\"STEP: {} MSE: {}\".format(STEPS,loss_mse(Xf, Y, W)))\n",
    "\n",
    "# Plot results\n",
    "plt.figure()\n",
    "plt.plot(X, Y, label='actual')\n",
    "plt.plot(X, predict(Xf, W), label='predicted')\n",
    "plt.legend()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Copyright 2019 Google Inc. Licensed under the Apache License, Version 2.0 (the \"License\"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.4"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
